Add TableWriterReplayer #11100
@duanmeng overall looks good. Can you add an e2e test for this? Thanks!
@duanmeng looks good overall. Some minor comments.
  core::PlanNodePtr createPlan() const override;

 private:
  const std::string targetDir_;
Shall we name this traceDir_ to be consistent with the naming in other places?
`targetDir_` is the output directory when we replay the `TableWriter`. Rename it to `replayOutputDir_`?
Renamed it to replayOutputDir_
  const auto traceRoot = fmt::format("{}/{}", rootDir_, taskId_);
  return PlanBuilder()
      .traceScan(
          fmt::format("{}/{}", traceRoot, nodeId_),
Use traceNodeDir as the name for the traceScan input? Thanks!
    const std::vector<std::string>& partitionKeys,
    const RowTypePtr& rowType) {
  ASSERT_EQ(actualDirs.size(), expectedDirs.size());
  auto iterActual = actualDirs.begin();
s/iterActual/actualDirIt/
s/iterExpected/expectedDirIt/
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@duanmeng thanks for the update. LGTM!
@@ -31,17 +33,46 @@ DEFINE_string(
    task_id,
    "",
    "Specify the target task id, if empty, show the summary of all the traced query task.");
DEFINE_string(node_id, "", "Specify the target node id.");
Do we still need these flags?
Yes, there may be multiple traced node directories under $rootDir/$taskId, so the node ID is still needed to pick one.
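To illustrate why both flags matter, here is a minimal sketch of how the trace directory for a single node could be resolved from the root directory, task ID, and node ID. It mirrors the `fmt::format("{}/{}", ...)` calls quoted in the diff above, but uses plain string concatenation to stay self-contained. The helper names `traceTaskDir` and `traceNodeDir` and the empty-ID check are assumptions made for illustration, not code from this PR.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: the directory that holds every traced node of one task,
// i.e. $rootDir/$taskId.
std::string traceTaskDir(const std::string& rootDir, const std::string& taskId) {
  return rootDir + "/" + taskId;
}

// Hypothetical helper: the directory of one traced plan node,
// i.e. $rootDir/$taskId/$nodeId. Several such directories may sit side by side
// under the task directory, so an empty node ID cannot identify a replay target.
std::string traceNodeDir(
    const std::string& rootDir,
    const std::string& taskId,
    const std::string& nodeId) {
  if (nodeId.empty()) {
    throw std::invalid_argument("node id is required to select a traced node");
  }
  return traceTaskDir(rootDir, taskId) + "/" + nodeId;
}
```

For example, `traceNodeDir("/tmp/trace", "task_0", "1")` yields `/tmp/trace/task_0/1`, while a second traced node of the same task would live in a sibling directory such as `/tmp/trace/task_0/2`.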
@xiaoxmeng merged this pull request in a933331.
Adds `TableWriterReplayer` to facilitate replaying the `TableWriter` operator. It uses the given plan node ID to find the traced `TableWriteNode` in the traced plan, creates a new `TableWriteNode`, rebuilds a query plan with a `QueryTraceScanNode`, applies the traced configurations, and reruns the plan.
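The find-and-rebuild step can be sketched with a toy plan tree. This is only an illustration under stated assumptions: `ToyPlanNode`, `findNodeById`, and `makeReplayPlan` are hypothetical stand-ins and not the Velox `core::PlanNode` API, and applying the traced configurations is left out.

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a plan node (hypothetical, NOT Velox's core::PlanNode).
struct ToyPlanNode {
  std::string id;
  std::string name;
  std::vector<std::shared_ptr<const ToyPlanNode>> sources;
};
using ToyPlanNodePtr = std::shared_ptr<const ToyPlanNode>;

// Depth-first search for the node with the given plan node ID.
// Returns nullptr if no such node exists in the tree.
ToyPlanNodePtr findNodeById(const ToyPlanNodePtr& root, const std::string& nodeId) {
  if (root == nullptr) {
    return nullptr;
  }
  if (root->id == nodeId) {
    return root;
  }
  for (const auto& source : root->sources) {
    if (auto found = findNodeById(source, nodeId)) {
      return found;
    }
  }
  return nullptr;
}

// Builds a replay plan: a copy of the traced node whose original sources are
// replaced by a single trace-scan node reading from traceNodeDir.
// Returns nullptr if the traced plan has no node with the given ID.
ToyPlanNodePtr makeReplayPlan(
    const ToyPlanNodePtr& tracedPlan,
    const std::string& nodeId,
    const std::string& traceNodeDir) {
  const ToyPlanNodePtr traced = findNodeById(tracedPlan, nodeId);
  if (traced == nullptr) {
    return nullptr;
  }
  ToyPlanNodePtr scan = std::make_shared<ToyPlanNode>(
      ToyPlanNode{"0", "TraceScan(" + traceNodeDir + ")", {}});
  ToyPlanNodePtr replay =
      std::make_shared<ToyPlanNode>(ToyPlanNode{traced->id, traced->name, {scan}});
  return replay;
}
```

The point of the sketch is that the replay plan keeps only the node under test: everything upstream of it is replaced by a scan over the recorded input, so the operator can be rerun in isolation.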
`QueryTraceScanNode` holds the traced data type and directory for a given plan node ID. This information is used to build the `QueryTraceScan` operator, which creates a `QueryDataReader` using the traced data type and input data file. To find the right input data file for replaying, we need both the pipeline ID and the driver ID, which are only known during operator creation, so the traced input data file and the output type have to be determined dynamically.
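The split between plan-time and operator-creation-time information can be sketched as follows. The types `ToyTraceScanNode` and `ToyTraceScanOperator` are hypothetical stand-ins, and the on-disk layout `<traceNodeDir>/<pipelineId>/<driverId>/data` is an assumption made purely for illustration; the PR text does not spell out the actual layout.

```cpp
#include <cstdint>
#include <string>

// Toy stand-in for the plan-time node (hypothetical): it knows only the
// traced node directory, not which driver will read from it.
struct ToyTraceScanNode {
  std::string traceNodeDir;
};

// Toy stand-in for the runtime operator (hypothetical): the pipeline ID and
// driver ID become available only when the operator is created, so the input
// file is resolved in the constructor rather than stored in the plan node.
struct ToyTraceScanOperator {
  std::string inputFile;

  ToyTraceScanOperator(
      const ToyTraceScanNode& node,
      std::uint32_t pipelineId,
      std::uint32_t driverId)
      // ASSUMED layout: <traceNodeDir>/<pipelineId>/<driverId>/data
      : inputFile(
            node.traceNodeDir + "/" + std::to_string(pipelineId) + "/" +
            std::to_string(driverId) + "/data") {}
};
```

With this shape, one plan node can back many operator instances, each reading the file recorded by the driver it corresponds to.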
Part of #9668